Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Continue counting other perf events group, even if there is an error in one #8

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

kwisniewski98
Copy link
Owner

Signed-off-by: Wisniewski, Krzysztof2 [email protected]

@katarzyna-z
Copy link

Is it possible to add tests to cover changes which you've made in the code?

@kwisniewski98 kwisniewski98 force-pushed the continue_perf_group_counting branch 2 times, most recently from c6b3e9a to 987a8b1 Compare October 12, 2020 09:22
Copy link

@katarzyna-z katarzyna-z left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, @Creatone please make also a review

@kwisniewski98 kwisniewski98 force-pushed the continue_perf_group_counting branch from 987a8b1 to c699255 Compare October 19, 2020 09:03
@@ -197,6 +199,34 @@ func TestNewCollector(t *testing.T) {
assert.Same(t, &perfCollector.events.Core.CustomEvents[0], perfCollector.eventToCustomEvent[Event("event_2")])
}

func TestCollectorSetup(t *testing.T) {
path, err := mockSystemDevices()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mockSystemDevices create a directory similar to /sys/devices/ with some uncore PMUs.
Perf collector needs cgroup path to pass it's identity to perfEventOpen and ioctlSetInt.
Why does this work? Because in this scenario you need only a directory with reading rights.

Try using ioutil.TempDir("", "cgroup")

@@ -20,6 +20,8 @@ package perf
import (
"bytes"
"encoding/binary"
"golang.org/x/sys/unix"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separate this import.

Core: Events{
Events: []Group{
{[]Event{"cache-misses"}, false},
{[]Event{"non-existing-event"}, false},

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good one! 👍

leaderFileDescriptors, err := c.createLeaderFileDescriptors(group.events, cgroupFd, groupIndex, leaderFileDescriptors)
if err != nil {
klog.Error(err)
continue

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You forgot about cleaning after the group. This leads to errors when reading stats.

Consider using this function

func (c *collector) deleteGroup(index int) {
	for name, files := range c.cpuFiles[index].cpuFiles {
		for cpu, file := range files {
			klog.V(5).Infof("Closing perf_event file descriptor for cgroup %q, event %q and CPU %d", c.cgroupPath, name, cpu)
			err := file.Close()
			if err != nil {
				klog.Warningf("Unable to close perf_event file descriptor for cgroup %q, event %q and CPU %d", c.cgroupPath, name, cpu)
			}
		}
	}
	delete(c.cpuFiles, index)
}

This is part of code from Destroy(). So it can look now like:

func (c *collector) Destroy() {
	c.uncore.Destroy()
	c.cpuFilesLock.Lock()
	defer c.cpuFilesLock.Unlock()

	for i, _ := range c.cpuFiles {
		c.deleteGroup(i)
	}
}

@@ -202,7 +202,7 @@ func (c *uncoreCollector) setup(events PerfEvents, devicesPath string) error {
}

if err != nil {
return err
klog.Error(err)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, count the entire group or skip it. So in this scenario, you need also to delete the group like in collector.

} else {
config, err := c.createConfigFromEvent(event)
if err != nil {
return nil, fmt.Errorf("unable create config from perf event %v", err)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got: unable create config from perf event unable to transform event name event-that-not-exists to perf_event_attr: -4
Can it look better?

config := c.createConfigFromRawEvent(customEvent)
leaderFileDescriptors, err = c.registerEvent(eventInfo{string(customEvent.Name), config, cgroupFd, groupIndex, isGroupLeader}, leaderFileDescriptors)
if err != nil {
return nil, fmt.Errorf("unable to register perf event %v", err)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to config error.

}
leaderFileDescriptors, err = c.registerEvent(eventInfo{string(event), config, cgroupFd, groupIndex, isGroupLeader}, leaderFileDescriptors)
if err != nil {
return nil, fmt.Errorf("unable to register perf event %v", err)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to config error.

@kwisniewski98 kwisniewski98 force-pushed the continue_perf_group_counting branch from c699255 to 104d8aa Compare October 20, 2020 10:10
@@ -202,9 +202,15 @@ func (c *uncoreCollector) setup(events PerfEvents, devicesPath string) error {
}

if err != nil {
return err
klog.Error(err)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete this blank line.

@Creatone
Copy link

E1020 15:34:19.117335  256842 uncore_libpfm.go:205] unable to transform event name something to perf_event_attr: -4
E1020 15:34:19.156177  256842 collector_libpfm.go:201] cannot create config from perf event: unable to transform event name something to perf_event_attr: -4

Maybe we can unify logs in this PR? What do you think?

@kwisniewski98 kwisniewski98 force-pushed the continue_perf_group_counting branch 2 times, most recently from 32dcffb to 60a7f41 Compare October 26, 2020 09:45
@kwisniewski98 kwisniewski98 force-pushed the continue_perf_group_counting branch from 60a7f41 to 5f78d87 Compare October 30, 2020 10:06
Signed-off-by: Wisniewski, Krzysztof2 <[email protected]>
@kwisniewski98 kwisniewski98 force-pushed the continue_perf_group_counting branch from 73d7734 to cf09a32 Compare November 3, 2020 09:51
@kwisniewski98 kwisniewski98 force-pushed the continue_perf_group_counting branch 2 times, most recently from 59f8a19 to 7179a92 Compare April 22, 2021 13:18
Signed-off-by: Wisniewski, Krzysztof2 <[email protected]>
@kwisniewski98 kwisniewski98 force-pushed the continue_perf_group_counting branch from 7179a92 to a1dee01 Compare April 28, 2021 09:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants